Oriented circular cylinders in an opaque medium are used to represent
certain microstructural objects in steel. The opaque medium is sliced
parallel to the cylinder axes of symmetry and the cut-plane contains the
observable rectangular profiles of the cylinders. A one-to-one relation
between the joint density of the squared radius and height of the 3D
cylinders and the joint density of the squared half-width and height of
the observable 2D rectangles is established. We propose a nonparametric
estimation procedure to estimate the distributions and expectations of
various quantities of interest, such as the cylinder radius, height, aspect
ratio, surface area and volume from the observed 2D rectangle widths and
heights. Also, the covariance between the radius and height of a cylinder
is estimated. The asymptotic behavior of these estimators is established to
yield point-wise confidence intervals for the expectations and point-wise
confidence sets for the distributions of the quantities of interest. Many
of these quantities can be linked to the mechanical properties of the
material, and are, therefore, useful for industry. We illustrate the
mathematical model and estimation procedures using a banded microstructure
for which nearly 90 μm of depth have been observed via serial sectioning.


1. Introduction. One of the biggest challenges of studying materials like
steel is the inability to see inside an opaque medium. While there are
methods to obtain three-dimensional (3D) information, they tend to be
costly both in terms of time and resources. Methods like serial sectioning
are destructive to the material and require long periods of time to collect
a reasonable amount of data. Nondestructive methods such as synchrotron
radiation are expensive and can only be performed at specialized
laboratories. The discipline of stereology provides many tools to confront
these issues in the sense that there are well-established models that
provide means of estimating various 3D quantities based on (relatively
inexpensive) two-dimensional (2D) observations and measurements; see, for
example, Mayhew (1991), Ohser and Mücklich (2000), Russ and Dehoff (2000).
A classical example comes from a study by Wicksell (1925) where the size
distribution of spherical corpuscles in spleens is estimated based on
measuring the circular cross-sections from slices of the spleens. Wicksell
derived the relationship between the distribution of the unobservable
sphere radii and the distribution of the observable cross-sectional circle
radii. He then used the empirical data and a histogram estimator to solve
his particular problem.

This basic stereological model has been applied in a variety of disciplines
where it is not possible to obtain full 3D measurements of objects simply
by looking at them; this includes biology, geology, astronomy and materials
science [Cruz-Orive and Weibel (1990), Giumelli, Militzer and Hawbolt
(1999), Higgins (2000), Jeppsson et al. (2011), Miyamoto (1994), Sahagian
and Proussevitch (1998), Sen and Woodroofe (2012), Tewari and Gokhale
(2001)]. Not surprisingly, the method has also gained considerable attention
in the statistics literature. There, the main focus is on computation and
asymptotic behavior of the proposed estimators [Cruz-Orive et al. (1985),
Mase (1995), Sen and Woodroofe (2012), Silverman et al. (1990), van Es
and Hoogendoorn (1990)].

In several applications the particles of interest are spheres, or close enough
to be treated as such. However, in many other applications the particles
are not spherical at all, and so it is important to also consider models
with nonspherical particles. The basic model with spheres has been extended
to randomly oriented cylinders, polygons, spheroids and ellipsoids,
and nonregular shapes [Andersen, Holme and Marioara (2008), Fullman
(1953), Higgins (2000), Giumelli, Militzer and Hawbolt (1999), Li et al.
(1999), Jensen (1995), Mehnert, Ohser and Klimanek (1998), Oakeshott
and Edwards (1992), Sahagian and Proussevitch (1998), Spiess and Spodarev
(2011), Thouless, Dalgleish and Evans (1988)].

All of this has led to a large body of work from which information of
interest to scientists, engineers and industry can be drawn. The tools that
have been created are powerful in their versatility. They can be applied to
real materials, to models and simulations. They can also be studied from a
theoretical point of view. The specific motivation for this current work comes
from banded steel microstructures, like the one shown in Figure 1. The
industry is interested in this particular material because it has anisotropic
properties, high susceptibility to cracking and corrosion, and it is more
difficult to machine than nonbanded material. This anisotropy can arise either
from the particular chemistry of the steel or during the rolling phase when
blocks of steel are flattened into sheets and rolled into coils. Currently, there
is no reliable way to prevent or control the banding under certain necessary
processing environments. Being able to quantitatively describe the sizes of
the bands in 3D will greatly aid industry in assessing the quality of the
material and the extent of the effects the bands have on the material coming
off the production line. Ultimately, this will also aid in understanding
and controlling the process that leads to band formation, thereby making it
possible to eliminate them from the material when they are undesirable.

NONPARAMETRIC INFERENCE FOR AN ORIENTED CYLINDER MODEL

Fig. 1. Optical image of a banded steel microstructure.

In this paper, we propose a simple model in which we use randomly sized,
oriented cylinders to represent the microstructural bands. Following the
example set forth by Wicksell (1925) when he considered spherical corpuscles
observed in spleens, we will consider the marginal distributions of the
radius and height of the cylinders. While most stereological models assume
that nonspherical objects are randomly oriented, in this case, it is clear that
this assumption is not appropriate. Therefore, by imposing the orientation
constraints, we can explore other properties of the cylinders, such as the
volume, surface area and aspect ratio. These quantities are important to
estimate because they are linked to the mechanical properties of the material.
For example, the surface area can be linked to the interface area between two
phases, which determines properties like strength and resistance to corrosion
or cracking.

In this work, we propose two nonparametric estimators for estimating the
distributions of the 3D cylinder quantities of interest from the 2D rectangle
observations. One estimator enforces a monotonicity constraint, inspired by
the work of Groeneboom and Jongbloed (1995), the other does not. An
empirical estimator is used to estimate the expectations of the 3D quantities of
interest from the 2D observations. The rates of convergence and asymptotic
distributions for all of these estimators are derived, which provide means
of estimating the point-wise confidence intervals for the expectations and
point-wise confidence sets for the distributions when the model is applied
to the steel microstructures. While a parametric estimator could perform
better than the nonparametric estimators we propose here, not enough is
known about the bands within steel microstructures to assume any particular
distribution for the radius and height of the cylinders. Therefore, the
first step toward understanding this distribution is to study it
nonparametrically and so this work focuses on the empirical and isotonic
estimators for understanding the material.

This paper is organized as follows. The cylinder model is introduced in
Section 2. The nonparametric estimation procedures are described in Section 3
and the asymptotic distributions and rates of convergence of the two different
estimators are derived in Sections 4 and 5. A simulation for validation of
the model is presented in Section 6 and, finally, in Section 7 the model is
applied to the banded microstructure.

2. Cylinder model. To represent the bands shown in Figure 1, the following
model is proposed (see Figure 2). Cylinders are generated with a joint
density $f$ for the squared radius $X$ [the choice to look at the squared
radius is inspired by Hall and Smith (1988)] and height $H$. The centers of
these cylinders are placed such that their axes of symmetry all have the
same orientation, as in Figure 2(c). A cylinder with radius $\sqrt{x}$ will
be intersected by the plane if and only if its center falls within slab
$S_x$ as shown in Figure 2(a). This leads to biased observations on the cut
plane since cylinders with larger radii have a higher probability of being
intersected. More specifically, the joint cumulative distribution function
(CDF) of $(X,H)$, given that the plane intersects the cylinder, can be
written as

\[
\begin{aligned}
P(X \le x, H \le h \mid \text{cylinder hits plane})
&= \frac{P(X \le x,\ H \le h \text{ and cylinder hits plane})}{P(\text{cylinder hits plane})} \\
&= \frac{\int_{y=0}^{x}\int_{m=0}^{h} \sqrt{y}\, f(y,m)\,dm\,dy}
        {\int_{y=0}^{\infty}\int_{m=0}^{\infty} \sqrt{y}\, f(y,m)\,dm\,dy} \\
&= \frac{1}{m_F^+}\int_{y=0}^{x}\int_{m=0}^{h} \sqrt{y}\, f(y,m)\,dm\,dy.
\end{aligned}
\]

Here, since the probability that the cylinder is cut is proportional to the
radius, the density function $f$ is weighted by the ratio of the radius of the
cylinder, $\sqrt{x}$, to the expected radius, $E_f[\sqrt{X}] = m_F^+$, which we
assume to be finite (see Assumption 1). Since the centers of the cylinders are
uniformly distributed throughout the medium, the distance from the center of a
cylinder that has been cut to the intersecting plane is a uniform random
variable, as shown in Figure 2(b). This is analogous to the relationship
between the circle radii and sphere radii in the method set forth by Wicksell
(1925). Once a cylinder has been cut, the observable portion is seen as a
rectangle on the cut plane, as shown in Figure 2(d).

Fig. 2. Visualization of the cylinder model. (a) Top view of cylinders in an
$M \times M \times M$ box with a cut plane (dashed line) and slab $S_x$ (solid
lines) into which cylinder centers should fall to be cut by the plane.
(b) Schematic view: $\sqrt{X}$ is the cylinder radius, $\sqrt{Z}$ is the
rectangle half-width, $U$ is a uniform random variable. (c) View of cut plane
through the box. (d) Observations on the cut plane.
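The sampling mechanism just described can be sketched in a few lines. This is only an illustration under assumed inputs: the uniform laws for $X$ and $H$, the function name `simulate_cut_rectangles` and the acceptance step are hypothetical choices and are not taken from the paper. A cylinder is kept with probability proportional to its radius $\sqrt{X}$, and the squared half-width is then $Z = X(1 - U^2)$ with $U$ uniform on $[0,1)$, following Figure 2(b).

```python
import numpy as np

def simulate_cut_rectangles(n_candidates, rng):
    """Simulate observed (Z, H) pairs for cylinders hit by the cut plane.

    Assumed (hypothetical) 3D law: X ~ Uniform(1, 4), H ~ Uniform(0.5, 2),
    independent. Any other joint law for (X, H) could be substituted.
    """
    x = rng.uniform(1.0, 4.0, n_candidates)   # squared radii
    h = rng.uniform(0.5, 2.0, n_candidates)   # heights
    # Size bias: a cylinder is hit with probability proportional to sqrt(x).
    # sqrt(x) <= 2 here, so sqrt(x) / 2 is a valid acceptance probability.
    hit = rng.uniform(0.0, 1.0, n_candidates) < np.sqrt(x) / 2.0
    x, h = x[hit], h[hit]
    # Distance from the cylinder center to the plane is U * sqrt(x), U uniform,
    # so the squared half-width is z = x - (U * sqrt(x))**2 = x * (1 - U**2).
    u = rng.uniform(0.0, 1.0, x.size)
    z = x * (1.0 - u ** 2)
    return z, h

rng = np.random.default_rng(0)
z, h = simulate_cut_rectangles(400_000, rng)
print(z.size, np.mean(np.sqrt(z)))
```

Heights are passed through unchanged because the cut is parallel to the cylinder axes; only the widths are shortened by the position of the plane.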

The rectangles have observable squared half-widths, $z$, and heights, $h$,
that have a joint density $g$. Since the cylinders are all cut parallel to their
axis, all of the height information for the cut cylinders is preserved and
directly observable on the cut-plane. (This shows that the distribution of
the cylinder centers along the direction of the heights does not require the
uniform random assumption.) The half-widths of the observed rectangles
are related to the cylinder radii through the relationship displayed in
Figure 2(b). From these 2D observations, one can estimate the 3D distribution
where the relationship between $g$ and $f$ can be obtained using a variant of
the well-known formula relating the density of the rectangle half-width (and
height) to the distance of cylinder center to the cut plane and the density
of the cylinder radius (and height):

\[
g(z,h) = \frac{\int_{x=z}^{\infty} (x-z)^{-1/2} f(x,h)\,dx}{2\int_{x=0}^{\infty} \sqrt{x}\, f_X(x)\,dx}
       = \frac{1}{2 m_F^+}\int_{x=z}^{\infty} (x-z)^{-1/2} f(x,h)\,dx. \tag{1}
\]

This relation can be inverted to obtain the joint density for the cylinder
radius and height as a function of the observable rectangle joint density:

\[
f(x,h) = -\frac{\partial}{\partial x}\,
         \frac{\int_{z=x}^{\infty} (z-x)^{-1/2} g(z,h)\,dz}{\int_{z=0}^{\infty} z^{-1/2} g_Z(z)\,dz}
       = -\frac{1}{m_G^-}\,\frac{\partial}{\partial x}\int_{z=x}^{\infty} (z-x)^{-1/2} g(z,h)\,dz, \tag{2}
\]

where $m_G^- = E_g[Z^{-1/2}]$ is the expectation of one over the rectangle
half-width and is also assumed to be finite (see Assumption 1). From this
relationship, the distributions of univariate quantities of interest such as
the height $H$, the squared radius $X$, the aspect ratio $R = \sqrt{X}/H$, the
surface area $S = 2\pi(X + \sqrt{X}H)$, and the volume $V = \pi X H$ can be
calculated.

The CDF for the observed height takes on the form

\[
F_H(h) = \int_{t=0}^{h} f_H(t)\,dt
       = \frac{1}{m_G^-}\int_{t=0}^{h}\int_{z=0}^{\infty} z^{-1/2} g(z,t)\,dz\,dt. \tag{3}
\]

Note that this CDF still contains the weight associated with the bias from
the radius of the cylinder. This accounts for any dependence that might exist
between the cylinder height and radius. Should such a dependence exist,
the observed rectangle height distribution will also be biased. See Figure 4
and Section 4.4 for a more detailed discussion of the biasing of the height
observations associated with a dependence of the height and radius.
For each of the other quantities of interest, define

\[
q(h;t) =
\begin{cases}
t, & \text{(squared radius } T = X\text{)},\\[2pt]
(ht)^2, & \text{(aspect ratio } T = \sqrt{X}/H\text{)},\\[2pt]
\Bigl(\sqrt{\dfrac{h^2}{4} + \dfrac{t}{2\pi}} - \dfrac{h}{2}\Bigr)^{2}, & \text{(surface area } T = 2\pi(X + \sqrt{X}H)\text{)},\\[6pt]
\dfrac{t}{\pi h}, & \text{(volume } T = \pi X H\text{)}
\end{cases}
\tag{4}
\]

[see Appendix for a comprehensive review of the relationships between $X$, $H$,
$Z$ and $q(h;t)$]. These functions are chosen such that the random variable of
interest $T$ satisfies $T > t$ if and only if $X > q(H;t)$ for $h, t > 0$. Hence,
using (2),

\[
F_T(t) = 1 - \int_{h=0}^{\infty}\int_{x=q(h;t)}^{\infty} f(x,h)\,dx\,dh
       = 1 - \frac{N(t)}{N(0)}, \tag{5}
\]

where $N$ is a bounded and decreasing function that can be rewritten as

\[
N(t) = N_{q(\cdot;t)}(t)
     = \int_{h=0}^{\infty}\int_{z=q(h;t)}^{\infty} \bigl(z - q(h;t)\bigr)^{-1/2} g(z,h)\,dz\,dh. \tag{6}
\]

Note that (6) allows for expression of the CDF of the unobservable 3D
cylinder properties in terms of a function $N$ involving only the joint density
$g$ of the observable pair $(Z,H)$. This suggests natural ways to estimate the
CDFs of these quantities, as will be discussed in Section 3. Also note that
under Assumption 1,

\[
N(t) \le N(0) = E_g[Z^{-1/2}] < \infty. \tag{7}
\]
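The four choices of $q(h;t)$ in (4) can be written down directly. The sketch below uses hypothetical function names and a single hand-picked cylinder to check the defining property that $T > t$ exactly when $X > q(H;t)$; it is an illustration, not code from the paper.

```python
import math

# q(h; t) from (4): thresholds on the squared radius X for each quantity T.
def q_squared_radius(h, t):
    return t

def q_aspect_ratio(h, t):
    return (h * t) ** 2

def q_surface_area(h, t):
    return (math.sqrt(h * h / 4.0 + t / (2.0 * math.pi)) - h / 2.0) ** 2

def q_volume(h, t):
    return t / (math.pi * h)

# The 3D quantities themselves, as functions of squared radius x and height h.
QUANTITIES = {
    "squared_radius": (lambda x, h: x, q_squared_radius),
    "aspect_ratio": (lambda x, h: math.sqrt(x) / h, q_aspect_ratio),
    "surface_area": (lambda x, h: 2.0 * math.pi * (x + math.sqrt(x) * h), q_surface_area),
    "volume": (lambda x, h: math.pi * x * h, q_volume),
}

def exceeds_agree(name, x, h, t):
    """True when 'T > t' and 'X > q(H; t)' give the same answer."""
    quantity, q = QUANTITIES[name]
    return (quantity(x, h) > t) == (x > q(h, t))

# Example cylinder: squared radius 4 (radius 2), height 1.
for name, levels in {"squared_radius": [3.0, 5.0], "aspect_ratio": [1.5, 2.5],
                     "surface_area": [30.0, 40.0], "volume": [10.0, 15.0]}.items():
    print(name, [exceeds_agree(name, 4.0, 1.0, t) for t in levels])
```

All four thresholds vanish at $t = 0$, which is why $N(0)$ is the same for every choice of $q$.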


Along with the distribution functions, it is useful to estimate the
expectations of the quantities of interest. It is especially important to be
able to express these 3D quantities entirely as functions of the density $g$ of
the observable variables $(Z, H)$. This can be done using equation (1) with
$\alpha, \beta > -1$ (given that the moments exist),

\[
E_g[Z^{\alpha}H^{\beta}]
  = \int_{h=0}^{\infty}\int_{z=0}^{\infty} z^{\alpha}h^{\beta} g(z,h)\,dz\,dh
  = \frac{\sqrt{\pi}\,\Gamma(\alpha+1)}{2\,m_F^+\,\Gamma(\alpha+3/2)}\,
    E_f[X^{\alpha+1/2}H^{\beta}], \tag{8}
\]

where $m_F^+$ is the same as that given in (1) and $\Gamma$ is the Gamma function.

From these cross-moments, another important quantity of interest can
be calculated: the covariance between the radii and heights of the cylinders.
From the moments given in equation (8), the following expression is obtained
for the covariance between the unobservable radius $\sqrt{X}$ and height $H$ in
terms of the observable rectangle half-width $\sqrt{Z}$ and height $H$:

\[
\operatorname{Cov}_f(\sqrt{X}, H) = \sigma_{\sqrt{X}H}
  = E_f[\sqrt{X}H] - E_f[\sqrt{X}]\,E_f[H]
  = \frac{(\pi/2)\,E_g[H]}{E_g[Z^{-1/2}]}
    - \frac{\pi/2}{E_g[Z^{-1/2}]}\cdot\frac{E_g[Z^{-1/2}H]}{E_g[Z^{-1/2}]}. \tag{9}
\]

The stated quantities of interest associated with the density $f$ are now
expressed in terms of the density $g$ of the observable quantities. The next
section will describe empirical and isotonic estimation procedures that can
be used to estimate the unknown distributions and covariance.
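As a worked check of how the moment relation (8) produces the covariance expression, the three special cases needed are written out below. The only facts used beyond (8) are $\Gamma(1) = 1$, $\Gamma(1/2) = \sqrt{\pi}$ and $\Gamma(3/2) = \sqrt{\pi}/2$.

```latex
\begin{aligned}
(\alpha,\beta) = (-\tfrac12, 0):\quad
  & E_g[Z^{-1/2}] = \frac{\sqrt{\pi}\,\Gamma(1/2)}{2 m_F^+ \Gamma(1)}
    = \frac{\pi}{2 m_F^+}
  && \Longrightarrow\quad E_f[\sqrt{X}] = m_F^+ = \frac{\pi/2}{E_g[Z^{-1/2}]},\\[4pt]
(\alpha,\beta) = (-\tfrac12, 1):\quad
  & E_g[Z^{-1/2}H] = \frac{\pi}{2 m_F^+}\,E_f[H]
  && \Longrightarrow\quad E_f[H] = \frac{E_g[Z^{-1/2}H]}{E_g[Z^{-1/2}]},\\[4pt]
(\alpha,\beta) = (0, 1):\quad
  & E_g[H] = \frac{\sqrt{\pi}\,\Gamma(1)}{2 m_F^+ \Gamma(3/2)}\,E_f[\sqrt{X}H]
    = \frac{E_f[\sqrt{X}H]}{m_F^+}
  && \Longrightarrow\quad E_f[\sqrt{X}H] = \frac{(\pi/2)\,E_g[H]}{E_g[Z^{-1/2}]}.
\end{aligned}
```

Substituting these three identities into $E_f[\sqrt{X}H] - E_f[\sqrt{X}]E_f[H]$ gives the covariance in terms of $g$ alone.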


3. Nonparametric estimation. The main statistical problem to solve is to
estimate the quantities defined in terms of the joint density $f$, as introduced
in Section 2, based on the observed data from the joint density $g$. A natural
estimator to begin with in this case is the empirical or plug-in estimator.

Plugging the empirical distribution of the observed data pairs $(Z_i, H_i)$
($1 \le i \le n$) into relations (3) and (6) yields

\[
\hat F_{H,n}(h) = \frac{\sum_{i=1}^{n} Z_i^{-1/2}\, 1_{[H_i \le h]}}{\sum_{i=1}^{n} Z_i^{-1/2}} \tag{10}
\]

as an estimator for the CDF of the heights and

\[
N_n(t) = N_{n,q(\cdot;t)}(t)
       = \frac{1}{n}\sum_{i=1}^{n} \bigl(Z_i - q(H_i;t)\bigr)^{-1/2}\, 1_{[Z_i > q(H_i;t)]} \tag{11}
\]

as estimators for the various choices of $N$ dependent on $q(h;t)$. These
estimators of $N$ can be plugged into (5) to obtain the estimators for the CDFs
of the various quantities of interest.
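A minimal sketch of the plug-in estimator of $N$ and the resulting CDF estimate $1 - N_n(t)/N_n(0)$ is given below. The function names and the tiny data set are illustrative assumptions; `q` is any function implementing one of the thresholds $q(h;t)$ of Section 2, here the squared-radius case.

```python
import numpy as np

def plug_in_N(z, h, t, q):
    """Plug-in estimate of N(t): the average of (Z_i - q(H_i; t))^(-1/2)
    over the observations with Z_i > q(H_i; t); other terms count as 0."""
    z = np.asarray(z, dtype=float)
    h = np.asarray(h, dtype=float)
    gap = z - q(h, t)
    terms = np.zeros_like(z)
    positive = gap > 0
    terms[positive] = gap[positive] ** -0.5
    return terms.mean()

def plug_in_cdf(z, h, t, q):
    """Estimate of F_T(t) = 1 - N(t) / N(0); not guaranteed to be monotone."""
    return 1.0 - plug_in_N(z, h, t, q) / plug_in_N(z, h, 0.0, q)

def q_squared_radius(h, t):
    # Threshold for T = X: does not depend on the height.
    return np.full_like(h, t)

z = [1.0, 4.0, 9.0]      # hypothetical squared half-widths
h = [1.0, 1.0, 1.0]      # hypothetical heights
print(plug_in_N(z, h, 3.0, q_squared_radius))
print(plug_in_cdf(z, h, 3.0, q_squared_radius))
```

Because each term blows up as $q(H_i;t)$ approaches $Z_i$ from below, the resulting curve in $t$ has spikes at the data points, which is the motivation for the monotone alternative discussed later in this section.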

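The expectations of interest in equation (8) can be estimated by the
empirical mean:

\[
\hat E[Z^{\alpha}H^{\beta}] = \frac{1}{n}\sum_{i=1}^{n} Z_i^{\alpha}H_i^{\beta}. \tag{12}
\]

In this way, the covariance between $\sqrt{X}$ and $H$ can be estimated by

\[
\hat\sigma_{n,\sqrt{X}H}
  = \frac{(\pi/2)\, n^{-1}\sum_{i=1}^{n} H_i}{n^{-1}\sum_{i=1}^{n} Z_i^{-1/2}}
    - \frac{(\pi/2)\, n^{-1}\sum_{i=1}^{n} Z_i^{-1/2}H_i}{\bigl(n^{-1}\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{2}}. \tag{13}
\]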
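The weighted height CDF and the covariance estimator are both ratios of sample means, so they can be sketched compactly. The function names and the two-point data set below are hypothetical and serve only to illustrate the formulas.

```python
import numpy as np

def height_cdf(z, h, level):
    """Weighted height CDF: each observation carries weight Z_i^(-1/2),
    which undoes the size bias of the cut plane."""
    z = np.asarray(z, dtype=float)
    h = np.asarray(h, dtype=float)
    w = z ** -0.5
    return w[h <= level].sum() / w.sum()

def cov_radius_height(z, h):
    """Plug-in estimate of Cov(sqrt(X), H) from observed (Z_i, H_i)."""
    z = np.asarray(z, dtype=float)
    h = np.asarray(h, dtype=float)
    u = np.mean(z ** -0.5)          # estimates E_g[Z^(-1/2)]
    v = np.mean(h * z ** -0.5)      # estimates E_g[Z^(-1/2) H]
    return (np.pi / 2.0) * (np.mean(h) / u - v / u ** 2)

z = [1.0, 4.0]    # hypothetical squared half-widths
h = [2.0, 4.0]    # hypothetical heights
print(height_cdf(z, h, 2.0))       # weight 1 out of total weight 1.5
print(cov_radius_height(z, h))
```

When all observed heights are equal, the two terms of the covariance estimate cancel exactly, as one would expect for a degenerate height distribution.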

The empirical plug-in estimator works well for estimating the covariance
and yields a monotonic function for the estimate of the distribution function
of the height. This is not true, however, for $N_n$. Although $N$ is
nonincreasing in view of (5), its estimator $N_n$ is a nonmonotonic function;
it even has poles due to the vanishing denominator when $q(H_i;t) = Z_i$. See,
for example, Figure 3. Therefore, inspired by the approach of Groeneboom and
Jongbloed (1995), we introduce an isotonic estimator, which enforces
monotonicity, to obtain estimates for $N$ and, consequently, the underlying
distribution functions of $X$, $R$, $S$ and $V$.

[Figure 3: plot titled "Estimate of volume distribution: V = πXH"; horizontal
axis v; legend: Underlying, Plug-in Estimate, Isotonic Estimate.]

Fig. 3. The estimates for the underlying distribution of the volume (given by
the simulation in Section 6) for $n = 50$ cylinders. The underlying distribution
is given by the dashed grey line, the empirical plug-in estimate is given by
the solid light grey line, and the isotonic estimate is given by the solid
black line.

Briefly, the isotonic estimator is the (nonincreasing) function $\hat N_n$ that
minimizes

\[
N \mapsto \int_{0}^{\infty} N(y)^2\,dy - 2\int_{0}^{\infty} N(y)\,N_n(y)\,dy \tag{14}
\]

over all nonincreasing functions on $[0,\infty)$. It is tempting to "complete the
square" and choose to minimize the function $\int (N(y) - N_n(y))^2\,dy$ instead
of (14), which should lead to the same solution since the added constant,
$\int_0^{\infty} N_n(y)^2\,dy$, does not depend on $N$. However, $N_n$ is not
square integrable, and so this added constant is infinite, making this problem
ill defined. Therefore, we stick to minimizing (14).

To solve the minimization problem (continuous isotonic regression), we
use Lemma 2 from Anevski and Soulier (2011) [see also Groeneboom and
Jongbloed (2010)], where a characterization is given for the solution of our
minimization problem. We begin by integrating the empirical estimator in
(11) with respect to $t$, yielding

\[
U_n(t) = \int_{0}^{t} N_n(s)\,ds. \tag{15}
\]

Then, define $U_n^*$ to be the least concave majorant of $U_n$, enforcing
monotonicity of its derivative. Finally, for $t > 0$, $\hat N_n(t) = U_n^{*,r}(t)$
is the right-hand derivative of $U_n^*$ evaluated at $t$.
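The construction via the least concave majorant can be approximated numerically. The sketch below is an assumption-laden illustration, not the authors' implementation: it treats only the squared-radius case $q(h;t) = t$, where the integrated plug-in estimator has the closed form $U_n(t) = \frac{2}{n}\sum_i \bigl(\sqrt{Z_i} - \sqrt{(Z_i - t)_+}\bigr)$, and it computes the majorant on a finite grid rather than over all $t$. The names `lcm_vertices` and `isotonic_N` are hypothetical.

```python
import numpy as np

def integrated_plug_in(z, grid):
    """U_n(t) on a grid for q(h;t) = t: integral of N_n from 0 to t."""
    z = np.asarray(z, dtype=float)[:, None]
    t = np.asarray(grid, dtype=float)[None, :]
    return np.mean(2.0 * (np.sqrt(z) - np.sqrt(np.clip(z - t, 0.0, None))), axis=0)

def lcm_vertices(t, u):
    """Indices of the vertices of the least concave majorant of the points
    (t[k], u[k]), with t strictly increasing (upper convex hull scan)."""
    hull = [0]
    for k in range(1, len(t)):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # Drop b when it lies on or below the chord from a to k.
            if (u[b] - u[a]) * (t[k] - t[a]) <= (u[k] - u[a]) * (t[b] - t[a]):
                hull.pop()
            else:
                break
        hull.append(k)
    return np.array(hull)

def isotonic_N(z, grid):
    """Grid approximation of the isotonic estimate: right-hand slope of the
    least concave majorant of U_n at each grid point."""
    grid = np.asarray(grid, dtype=float)
    u = integrated_plug_in(z, grid)
    hull = lcm_vertices(grid, u)
    slopes = np.diff(u[hull]) / np.diff(grid[hull])
    # Segment whose left vertex is the last hull vertex at or before t.
    seg = np.searchsorted(grid[hull], grid, side="right") - 1
    seg = np.clip(seg, 0, len(slopes) - 1)   # last grid point reuses last slope
    return slopes[seg]

rng = np.random.default_rng(1)
z = rng.uniform(0.5, 4.0, 50)            # hypothetical observed squared half-widths
grid = np.linspace(0.0, 5.0, 201)        # must start at 0 and be increasing
print(isotonic_N(z, grid)[:5])
```

The slopes of a concave majorant decrease from left to right, so the returned values are nonincreasing by construction; a finer grid brings the approximation closer to the continuous solution.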

Sections 4 and 5 will consider the rates of convergence and asymptotic 
distributions for the plug-in estimators and the isotonic estimator in turn. 

4. Asymptotic distributions of the plug-in estimators. There are a few 
assumptions on the observed variables that are required for the derivation 
of consistency and the various asymptotic distributions to hold. 

Assumption 1. $0 < E_g[Z^{-1/2}] < \infty$. Equivalently, via (1) and (8),
$0 < E_f[\sqrt{X}] < \infty$.

Assumption 2. $E_g[H^{5+\varepsilon}] < \infty$ for some $\varepsilon > 0$.

Assumption 3. $E_g[Z^{-1/2}H] < \infty$.

Under Assumptions 1, 2 and 3, the plug-in estimators for the distribution
function of $H$, the quantities $N(t)$ for $X$, $R$, $S$ and $V$ (for fixed $t$),
and the covariance in equations (10), (11) and (13), respectively, are
consistent by the law of large numbers. From (1), (2) and (8) it follows that
the random variables $Z^{-1/2}$, $HZ^{-1/2}$ and
$[Z - q(H;t)]^{-1/2}\,1_{[Z > q(H;t)]}$ have infinite variances. This means that
the standard (finite variance) central limit theorem cannot be used to derive
relevant asymptotic distributions. The theorem below states a central limit
result for random variables with infinite variances that will be needed in the
sequel.
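To see where the infinite variance comes from, consider $Z^{-1/2}$. The following short calculation is a sketch that assumes the marginal density $g_Z$ is continuous at $0$ with $g_Z(0) > 0$; by (1), $g_Z(0) = E_f[X^{-1/2}]/(2m_F^+)$, which is strictly positive.

```latex
\begin{aligned}
E_g\bigl[(Z^{-1/2})^2\bigr] &= \int_{0}^{\infty} z^{-1} g_Z(z)\,dz
  \;\ge\; \tfrac12\, g_Z(0)\int_{0}^{\delta} z^{-1}\,dz = \infty
  \qquad\text{for } \delta \text{ small enough that } g_Z \ge \tfrac12 g_Z(0) \text{ on } (0,\delta),\\[4pt]
P\bigl(Z^{-1/2} > y\bigr) &= P\bigl(Z < y^{-2}\bigr)
  = \int_{0}^{y^{-2}} g_Z(z)\,dz \;\sim\; \frac{g_Z(0)}{y^{2}}
  \qquad (y \to \infty).
\end{aligned}
```

The second line shows a tail of order $y^{-2}$, which is the borderline case in which the mean is finite, the variance is not, and a normal limit still holds with the slower normalization $\sqrt{n/\ln n}$.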

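Theorem 1. Let $Y_i$ for $i = 1, 2, \ldots$ be i.i.d. random variables. Denote
the distribution of $Y_1$ by $K$ and define $\bar Y_n = n^{-1}\sum_{i=1}^{n} Y_i$.
If $E_K|Y_1| < \infty$ and $P_K(Y_i > c) \sim \kappa/c^2$ as $c \to \infty$ and
$E_K[Y_i^2\, 1_{[Y_i \in [0,c)]}] \sim \kappa \ln(c^2)$, where $\kappa > 0$ is a
constant, then

\[
\sqrt{\frac{n}{\ln n}}\,\bigl(\bar Y_n - E_K[Y_i]\bigr) \xrightarrow{d} \mathcal{N}(0,\kappa).
\]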

Proof. We apply Theorem 4 from Chapter 9 of Chow and Teicher
(1988). To this end, note that because $P_K(Y_i > c) \sim \kappa/c^2$ and
$E_K[Y_i^2\, 1_{[Y_i \in [0,c)]}] \sim \kappa\ln(c^2)$, the following condition
holds:

\[
\lim_{c\to\infty} \frac{\int_{|y|>c} dK(y)}{(1/c^2)\int_{|y|<c} y^2\,dK(y)}
 = \lim_{c\to\infty} \frac{P(Y_i > c)}{(1/c^2)\,E_K[Y_i^2\, 1_{[Y_i \in [0,c)]}]}
 = \lim_{c\to\infty} \frac{1}{\ln(c^2)}
 = 0.
\]

Now, choose $c = \sqrt{n\ln(n)\kappa}$ and define
$A_n = \frac{n}{B_n}\int_{|y|<B_n} y\,dK(y)$ and
$B_n = \sup\{c : c^{-2}\int_{|y|<c} y^2\,dK(y) \ge 1/n\}$. This leads to
$B_n \sim c$ and $A_n \sim \frac{n}{\sqrt{n\ln(n)\kappa}}\,E_K[Y_i]$ for
$n \to \infty$ since $E_K|Y_i| < \infty$. Consequently, the central limit theorem
holds where, for $y \in \mathbb{R}$,

\[
\lim_{n\to\infty} P\Bigl(\frac{1}{B_n}\sum_{i=1}^{n} Y_i - A_n < y\Bigr)
 = \lim_{n\to\infty} P\Bigl(\sqrt{\frac{n}{\ln(n)\kappa}}\,\bigl(\bar Y_n - E_K[Y_i]\bigr) < y\Bigr)
 = \Phi(y),
\]

where $\Phi$ is the CDF of the standard normal distribution. □


4.1. Asymptotic distributions for the estimators of $N(t)$ and $F_T(t)$. Using
Theorem 1, we derive the asymptotic distribution for estimators of $N(t)$
for the various choices of $q$ given in (4). We begin by defining the density
function of the random variable $Z - q(H;t)$ as

\[
\tau_q(z) = \tau_{q(\cdot;t)}(z) = \int_{h=0}^{\infty} g\bigl(z + q(h;t), h\bigr)\,dh. \tag{16}
\]

Assumption 4. $\tau_q'$ is continuous and uniformly bounded by some
$M < \infty$ in a right neighborhood of 0.

If Assumption 4 holds, then (16) has the important property that for
$\delta \downarrow 0$,

\[
\int_{z=0}^{\delta} \tau_q(z)\,dz = \delta\,\tau_q(0) + o(\delta). \tag{17}
\]

Theorem 2. Let $(Z_i, H_i)$ ($i = 1, 2, \ldots$) be an i.i.d. sequence with
density $g$ given in (1), $t > 0$ fixed, and let $q$ be any of the choices given
by (4). Furthermore, let $N_n$ be defined as in (11) and let Assumption 1 hold
and Assumption 4 be satisfied for $q(\cdot;t)$ and $g$. Then

\[
\sqrt{\frac{n}{\ln n}}\,\bigl(N_n(t) - N(t)\bigr) \xrightarrow{d} \mathcal{N}\bigl(0, \tau_q(0)\bigr). \tag{18}
\]


Proof. Define the i.i.d. sequence $Y_1, Y_2, \ldots$ by
$Y_i = [Z_i - q(H_i;t)]^{-1/2}\, 1_{[Z_i > q(H_i;t)]}$ for $i = 1, 2, \ldots$ with
distribution function $K_Y$. Note that $N_n(t) = n^{-1}\sum_{i=1}^{n} Y_i$ and
$E[Y_i] = N(t) < \infty$ by Assumption 1 and (7). The tail probabilities of
$Y_i$ behave like

\[
\begin{aligned}
P(Y_i > y) &= P\Bigl(\frac{1_{[Z_i > q(H_i;t)]}}{\sqrt{Z_i - q(H_i;t)}} > y\Bigr)
  = P\Bigl(q(H_i;t) < Z_i < \frac{1}{y^2} + q(H_i;t)\Bigr) \\
 &= \int_{h=0}^{\infty}\int_{z=q(h;t)}^{1/y^2 + q(h;t)} g(z,h)\,dz\,dh
  = \int_{h=0}^{\infty}\int_{z=0}^{1/y^2} g\bigl(z + q(h;t), h\bigr)\,dz\,dh \\
 &= \int_{z=0}^{1/y^2}\int_{h=0}^{\infty} g\bigl(z + q(h;t), h\bigr)\,dh\,dz
  = \int_{z=0}^{1/y^2} \tau_q(z)\,dz
  = \frac{1}{y^2}\,\tau_q(0) + o(y^{-2}).
\end{aligned}
\]

Applying (17) as $y \to \infty$, we see that $\kappa = \tau_q(0)$ in Theorem 1.
The expectation of $Y_i^2$ truncated at $c = \sqrt{n\ln(n)\kappa}$ is

\[
E\bigl[Y_i^2\, 1_{[Y_i \in [0,c)]}\bigr] = \int_{y=0}^{c} y^2\,dK_Y(y)
  = \int_{y=0}^{c} 2y\,\bigl(K_Y(c) - K_Y(y)\bigr)\,dy \sim \ln(c^2)\,\tau_q(0). \tag{19}
\]

This relationship is proven in the supplemental article [McGarrity, Sietsma
and Jongbloed (2014)]. Therefore, from Theorem 1 the result follows. □

By Theorem 2, the asymptotic variances for the estimators $N_n(t)$ based
on the quantities $q$ for the squared radius, aspect ratio, surface area and
volume, respectively, are given by

\[
\int_{h=0}^{\infty} g(t,h)\,dh = g_Z(t), \qquad
\int_{h=0}^{\infty} g(h^2t^2, h)\,dh, \qquad
\int_{h=0}^{\infty} g\Bigl(\Bigl(\sqrt{\tfrac{h^2}{4} + \tfrac{t}{2\pi}} - \tfrac{h}{2}\Bigr)^{2}, h\Bigr)\,dh
\quad\text{and}\quad
\int_{h=0}^{\infty} g\Bigl(\frac{t}{\pi h}, h\Bigr)\,dh. \tag{20}
\]

Note that for the squared radius, result (20) is not new. Since it is
independent of height, this result is the same as the result stated in
Theorem 2 by Groeneboom and Jongbloed (1995) for spherical particles in
Wicksell's problem. However, for the other quantities of interest, which
require both the squared radius and the height of the cylinders, the result is
different from what can be obtained by following Groeneboom and Jongbloed's
approach to the Wicksell problem. The asymptotic distributions of $N_n(t)$ can
be used to obtain the asymptotic distributions of the corresponding
distribution functions of interest, evaluated at $t$. Note that for all choices
of $q$ in (4), $N_n(0) = \frac{1}{n}\sum_{i=1}^{n} Z_i^{-1/2}$ and
$N(0) = E_g[Z^{-1/2}] = m_G^- = \pi/(2m_F^+)$.

Corollary 1. Based on the estimators $N_n(t)$ of Theorem 2, define
$F_n(t) = 1 - N_n(t)/N_n(0)$ as estimator for $F_T$ defined in (5). Then, under
the conditions of Theorem 2, we have for $n \to \infty$

(21)

The proof follows from Theorem 2 using Slutsky's lemma.


4.2. Asymptotic distribution for the estimator of the covariance. Finding
the asymptotic distribution of the covariance estimator is more complicated
than for any single expectation estimator. Therefore, this asymptotic
distribution is considered first and the results are then applied to the
simpler estimators for the various expectations. From Assumption 2 the
variance of $H$ is finite. Therefore, the standard central limit theorem for
finite variance random variables holds for the sample mean of the $H_i$'s and
we can define an approximating quantity for the covariance that depends only
on the terms involving $Z_i^{-1/2}$ [compared to (13)]:

\[
\tilde\sigma_{n,\sqrt{X}H}
  = \frac{(\pi/2)\,E_g[H]}{n^{-1}\sum_{i=1}^{n} Z_i^{-1/2}}
    - \frac{(\pi/2)\, n^{-1}\sum_{i=1}^{n} Z_i^{-1/2}H_i}{\bigl(n^{-1}\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{2}}.
\]

Note that $\delta_n^{-1}(\hat\sigma_{n,\sqrt{X}H} - \tilde\sigma_{n,\sqrt{X}H}) \to 0$,
where $\delta_n = \sqrt{\ln(n)/n}$. Hence, to derive the asymptotic distribution
of $\delta_n^{-1}(\hat\sigma_{n,\sqrt{X}H} - \sigma_{\sqrt{X}H})$, it suffices to
derive the asymptotic distribution of
$\delta_n^{-1}(\tilde\sigma_{n,\sqrt{X}H} - \sigma_{\sqrt{X}H})$. Considering this
distribution, define the function $\phi : (0,\infty)^2 \to \mathbb{R}$ as

\[
\phi(u,v) = \frac{\pi}{2}\Bigl(\frac{E_g[H]}{u} - \frac{v}{u^2}\Bigr).
\]

Moreover, define

\[
T_n = \frac{1}{n}\sum_{i=1}^{n}
\begin{pmatrix} Z_i^{-1/2} \\ H_i Z_i^{-1/2} \end{pmatrix}, \tag{22}
\]

leading to $\tilde\sigma_{n,\sqrt{X}H} = \phi(T_n)$. In order to pin down the
asymptotic variance of $\tilde\sigma_{n,\sqrt{X}H}$ we need two more assumptions
and the following lemma.

Assumption 5. $\xi_g^j = \int_{h=0}^{\infty} h^j g(0,h)\,dh < \infty$ for
$j = 0, 1, 2$.

Assumption 6. For some constant $K < \infty$,
$\bigl|\frac{\partial}{\partial z} g(z,h)\bigr| < K$ for all $z, h > 0$.












K. S. MCGARRITY, J. SIETSMA AND G. JONGBLOED 


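Lemma 1. Let $T_n$ be as defined in (22). Assume that Assumptions 1, 2,
3, 5 and 6 hold, then

\[
\delta_n^{-1}\bigl(T_n - E_g[T_n]\bigr) \xrightarrow{d} \mathcal{N}(0,\Sigma)
\quad\text{where}\quad
\Sigma = \begin{pmatrix} \xi_g^0 & \xi_g^1 \\ \xi_g^1 & \xi_g^2 \end{pmatrix} \tag{23}
\]

and the entries in $\Sigma$ can be formulated from (8) to yield

\[
\xi_g^j = \int_{h=0}^{\infty} h^j g(0,h)\,dh = \frac{E_f[X^{-1/2}H^j]}{2E_f[X^{1/2}]}. \tag{24}
\]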

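The proof of this lemma can be found in the supplemental article
[McGarrity, Sietsma and Jongbloed (2014)]. We now apply the delta method to the
quantity $\phi(T_n)$, which yields

\[
\delta_n^{-1}\bigl(\tilde\sigma_{n,\sqrt{X}H} - \sigma_{\sqrt{X}H}\bigr)
  = \delta_n^{-1}\bigl(\phi(T_n) - \phi(E_g[T_n])\bigr)
  \xrightarrow{d} \mathcal{N}(0,\nu^2),
\]

where

\[
\nu^2 = \bigl(\nabla\phi(E_g[T_n])\bigr)^{T}\,\Sigma\,\bigl(\nabla\phi(E_g[T_n])\bigr)
\]

and

\[
\nabla\phi(u,v) = \Bigl(\frac{\partial}{\partial u}, \frac{\partial}{\partial v}\Bigr)\phi(u,v)
 = \frac{\pi}{2}\Bigl(\frac{2v - E_g[H]\,u}{u^3},\ -\frac{1}{u^2}\Bigr).
\]

This provides $\nu^2$ in terms of the joint densities of the observable
variables:

\[
\nu^2 = \Bigl(\frac{\pi}{2}\Bigr)^{2} \bigl(E_g[Z^{-1/2}]\bigr)^{-4}
\Bigl\{ \xi_g^0\Bigl(4\frac{(E_g[Z^{-1/2}H])^2}{(E_g[Z^{-1/2}])^2}
      - 4\frac{E_g[Z^{-1/2}H]\,E_g[H]}{E_g[Z^{-1/2}]} + (E_g[H])^2\Bigr)
  + 2\xi_g^1\Bigl(E_g[H] - 2\frac{E_g[Z^{-1/2}H]}{E_g[Z^{-1/2}]}\Bigr)
  + \xi_g^2 \Bigr\}. \tag{25}
\]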

Given the cross-moment relationships in (8) and (24), $\nu^2$ can also be
expressed in terms of the underlying joint distribution of the cylinder radii
and heights:

\[
\begin{aligned}
\nu^2 = \Bigl(\frac{\pi}{2}\Bigr)^{-2} \frac{E_f[X^{1/2}]}{2}
 \Bigl\{ & E_f[X^{-1/2}]\bigl(4(E_f[X^{1/2}])^2 (E_f[H])^2
      - 4E_f[H]\,E_f[X^{1/2}H]\,E_f[X^{1/2}] + (E_f[X^{1/2}H])^2\bigr) \\
 & + 2E_f[X^{-1/2}H]\bigl(E_f[X^{1/2}H]\,E_f[X^{1/2}] - 2E_f[H]\,(E_f[X^{1/2}])^2\bigr) \\
 & + E_f[X^{-1/2}H^2]\,(E_f[X^{1/2}])^2 \Bigr\}.
\end{aligned}
\tag{26}
\]

This proves the following theorem for the plug-in estimator for
$\sigma_{\sqrt{X}H}$.

Theorem 3. Let $\sigma_{\sqrt{X}H}$ and $\hat\sigma_{n,\sqrt{X}H}$ be defined as
in (9) and (13), respectively. Under the assumptions of Lemma 1, for $\nu^2$
defined in (25) and (26),

\[
\sqrt{\frac{n}{\ln n}}\,\bigl(\hat\sigma_{n,\sqrt{X}H} - \sigma_{\sqrt{X}H}\bigr)
  \xrightarrow{d} \mathcal{N}(0,\nu^2) \quad\text{as } n \to \infty.
\]

4.3. Estimating the expectations. From (8) and (12), it is simple to verify 
that the various 3D quantities of interest are given by the 2D observable 
quantities with their empirical estimators given in Table 1. 

Due to the dependence of the aspect ratio on $H^{-1}$, several more
assumptions are required to continue this analysis. For brevity and simplicity,
the expectation of the aspect ratio will not be considered any further.
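The empirical estimators of the expectations are all ratios with the same denominator, the sample mean of $Z_i^{-1/2}$. A minimal sketch, with a hypothetical function name and a two-point data set chosen so the values can be checked by hand:

```python
import numpy as np

def expectation_estimates(z, h):
    """Empirical estimates of E_f[T] for several 3D quantities T, computed
    from observed squared half-widths z and heights h."""
    z = np.asarray(z, dtype=float)
    h = np.asarray(h, dtype=float)
    s = np.mean(z ** -0.5)                     # estimates E_g[Z^(-1/2)]
    mean_root = np.mean(np.sqrt(z))            # estimates E_g[Z^(1/2)]
    return {
        "radius": (np.pi / 2.0) / s,
        "squared_radius": 2.0 * mean_root / s,
        "height": np.mean(h * z ** -0.5) / s,
        "volume": 2.0 * np.pi * np.mean(np.sqrt(z) * h) / s,
        "surface_area": 2.0 * np.pi * (2.0 * mean_root + (np.pi / 2.0) * np.mean(h)) / s,
    }

estimates = expectation_estimates([1.0, 4.0], [2.0, 4.0])
for name, value in estimates.items():
    print(f"{name}: {value:.4f}")
```

The aspect ratio is left out here for the reason given above: its expectation involves $H^{-1}$ and needs additional assumptions.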

To obtain the asymptotic distributions, Lemma 1 and the delta method 
can be used with the following assumption. 

Assumption 7. $E_g[Z^{1/2}H^j] < \infty$ and $E_g[(Z^{1/2}H^j)^2] < \infty$,
where $j = 0, 1$.

Under Assumption 7, the expectations can be treated as constants in the
modified function as discussed for the expectation of the height in the
previous section. The coefficients s and t for linearizing (22) are taken to be
zero where appropriate. Then, the asymptotic variance for the estimation of
the quantities of interest given above is listed in Table 2.

Table 1
Expectations and empirical estimates of the 3D quantities of interest given as
functions of the expectations and empirical estimates of the 2D observable
quantities

Quantity of interest ($T$) | Expectation $E_f[T]$ | Empirical estimator $\hat E_f[T]$
Radius: $X^{1/2}$ | $(\pi/2)(E_g[Z^{-1/2}])^{-1}$ | $(\pi/2)\bigl((1/n)\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{-1}$
Squared radius: $X$ | $(2E_g[Z^{1/2}])(E_g[Z^{-1/2}])^{-1}$ | $\bigl(2\sum_{i=1}^{n} Z_i^{1/2}\bigr)\bigl(\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{-1}$
Height: $H$ | $(E_g[Z^{-1/2}H])(E_g[Z^{-1/2}])^{-1}$ | $\bigl(\sum_{i=1}^{n} Z_i^{-1/2}H_i\bigr)\bigl(\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{-1}$
Volume: $\pi X H$ | $(2\pi E_g[Z^{1/2}H])(E_g[Z^{-1/2}])^{-1}$ | $\bigl(2\pi\sum_{i=1}^{n} Z_i^{1/2}H_i\bigr)\bigl(\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{-1}$
Surface area: $2\pi(X + X^{1/2}H)$ | $2\pi[(2E_g[Z^{1/2}])(E_g[Z^{-1/2}])^{-1} + (\pi/2)E_g[H]\,(E_g[Z^{-1/2}])^{-1}]$ | $2\pi[\bigl(2\sum_{i=1}^{n} Z_i^{1/2}\bigr)\bigl(\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{-1} + (\pi/2)\bigl(\sum_{i=1}^{n} H_i\bigr)\bigl(\sum_{i=1}^{n} Z_i^{-1/2}\bigr)^{-1}]$
Aspect ratio: $X^{1/2}H^{-1}$ | $((\pi/2)E_g[H^{-1}])(E_g[Z^{-1/2}])^{-1}$ |

Table 2
Asymptotic variances $\nu^2$ from Corollary 2

Quantity of interest | Asymptotic variance $\nu^2$
Radius | $(\pi/2)^2\,\xi_g^0\,(E_g[Z^{-1/2}])^{-4}$
Squared radius | $4\,\xi_g^0\,(E_g[Z^{1/2}])^2\,(E_g[Z^{-1/2}])^{-4}$
Height | $\{\xi_g^0 (E_g[Z^{-1/2}H])^2 - 2\xi_g^1 E_g[Z^{-1/2}H]\,E_g[Z^{-1/2}] + \xi_g^2 (E_g[Z^{-1/2}])^2\}\,(E_g[Z^{-1/2}])^{-4}$
Volume | $4\pi^2\,\xi_g^0\,(E_g[Z^{1/2}H])^2\,(E_g[Z^{-1/2}])^{-4}$
Surface area | $\xi_g^0\,(4\pi E_g[Z^{1/2}] + \pi^2 E_g[H])^2\,(E_g[Z^{-1/2}])^{-4}$

This leads to the following corollary to Theorem 3. 

Corollary 2. Let $E_f[T]$ and $\hat E_f[T]$ be defined as in Table 1, where $T$
is any of the quantities of interest listed in Table 1. Under the assumptions
of Lemma 1 and Assumption 7, for $\nu^2$ as defined in Table 2,

\[
\sqrt{\frac{n}{\ln n}}\,\bigl(\hat E_f[T] - E_f[T]\bigr) \xrightarrow{d} \mathcal{N}(0,\nu^2)
\quad\text{as } n \to \infty.
\]

Theorem 3 and Corollary 2 show that the expectations of the quantities
of interest can be estimated consistently with a rate of $\sqrt{\ln(n)/n}$. These
results can be used to obtain the 95% confidence intervals for the unknown
expectations being estimated by $\hat E_f[T]$:

\[
\hat E_f[T] \pm 1.96\,\nu\,\sqrt{\frac{\ln n}{n}}. \tag{27}
\]

For a discussion on the small sample properties and coverage probability of
the confidence intervals, see Chapter 4 of McGarrity (2013) and the
supplemental article [McGarrity, Sietsma and Jongbloed (2014)].
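The asymptotic normality with rate $\sqrt{\ln(n)/n}$ translates into an interval whose half-width shrinks slightly more slowly than in the usual finite-variance setting. The helper below is a hypothetical sketch: it assumes a value for the asymptotic standard deviation $\nu$ is supplied by the user (for instance a plug-in value based on Table 2), and it does not address how $\nu$ itself should be estimated.

```python
import math

def confidence_interval(estimate, nu, n, z_crit=1.959964):
    """Approximate 95% interval: estimate +/- z_crit * nu * sqrt(ln(n) / n).

    nu is the asymptotic standard deviation, assumed known or estimated
    separately; z_crit is the standard normal 97.5% quantile."""
    half_width = z_crit * nu * math.sqrt(math.log(n) / n)
    return estimate - half_width, estimate + half_width

# Hypothetical numbers: point estimate 1.0, nu = 2.0, n = 1000 rectangles.
low, high = confidence_interval(1.0, 2.0, 1000)
print(low, high)

# Compare the half-width with the finite-variance rate 1/sqrt(n).
print((high - low) / 2.0, 1.959964 * 2.0 / math.sqrt(1000))
```

The extra factor $\sqrt{\ln n}$ is about 2.6 at $n = 1000$, so intervals are noticeably wider than a naive $1/\sqrt{n}$ calculation would suggest.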

4.4. Asymptotic distribution for the estimator of the height distribution.
Consider the plug-in estimator for the distribution function of heights, given
in (10). As mentioned before, under Assumption 2, the law of large numbers
immediately gives that $\hat F_{H,n}(h) \to F_H(h)$ as $n \to \infty$. The
asymptotic distribution is given in the theorem below.

Theorem 4. Consider $F_H(h)$ and $\hat F_{H,n}(h)$ as given in (3) and (10),
respectively. Under Assumptions 1 and 5,

\[
\sqrt{\frac{n}{\ln n}}\,\bigl(\hat F_{H,n}(h) - F_H(h)\bigr) \xrightarrow{d} \mathcal{N}(0,\nu^2),
\]

where
$\nu^2 = (m_G^-)^{-2}\bigl(F_H(h)^2\int_{h}^{\infty} g(0,y)\,dy
 + (1 - F_H(h))^2\int_{0}^{h} g(0,y)\,dy\bigr)$.

Proof. Consider the random vectors

\[
T_n = \frac{1}{n}\sum_{i=1}^{n}
\begin{pmatrix} Z_i^{-1/2} \\ Z_i^{-1/2}\,1_{[H_i \le h]} \end{pmatrix}
\quad\text{with}\quad
E[T_n] = \begin{pmatrix} m_G^- \\ \int_{z=0}^{\infty}\int_{y=0}^{h} z^{-1/2} g(z,y)\,dy\,dz \end{pmatrix}.
\]

For $T_n$ it is shown in the supplemental article [McGarrity, Sietsma and
Jongbloed (2014)] that

\[
\sqrt{\frac{n}{\ln n}}\,\bigl(T_n - E[T_n]\bigr) \xrightarrow{d} \mathcal{N}(0,\Sigma), \tag{28}
\]

where the entries of $\Sigma$ are given by
$\Sigma_{12} = \Sigma_{21} = \Sigma_{22} = \int_{y=0}^{h} g(0,y)\,dy$ and
$\Sigma_{11} = g_Z(0)$. The result follows by applying the delta method to the
function $\phi(u,v) = v/u$ at $T_n$, yielding asymptotic normality with variance
$\nu^2$. □

The estimator for the distribution of the heights given in (10) accounts 
for any dependence between the radius and height of the cylinders. Any 
correlation that might exist will lead to the height observations being biased 
like the rectangle half-width observations due to the larger cylinders being 
more likely to be intersected by the cut plane. However, if the heights are 
known to be independent from the cylinder radius, then the biasing in the 
problem has no consequences for the distribution of observable heights and 
we may simply take the empirical distribution of the observed heights to be 
the estimate of the actual distribution: 

$$\hat{F}_H(h) = \frac{1}{n}\sum_{i=1}^{n} 1_{[0,h]}(H_i).$$

This estimator has a rate of convergence of $1/\sqrt{n}$. Figure 4 shows the effect
of the rate of convergence for the estimation procedure. Focusing on the left
images, the upper image shows the 2D (light grey line) and 3D (dark grey
line) empirical distributions for the heights of 500 uncorrelated (radii and
cylinder heights), uniformly distributed cylinders. The bottom image shows
the same for 5000 cylinders. The black solid line shows the estimation of
the 3D distribution as calculated from (10). The empirical distribution is
a better choice than (10) in this case because it has the faster rate of
convergence. In contrast, in the right images, where there is a nonzero
correlation between the radii and heights of the cylinders, the biasing in
the 2D distribution (light grey lines compared to the dark grey line for the
3D empirical distribution) is clear. In this case, the estimator from (10) is
necessary to accurately estimate the underlying 3D distribution.

Fig. 4. The upper and lower figures show the estimate of $F_H(h)$ for n = 500 and n = 5000
cylinders, respectively. The dark grey lines show the 3D empirical distributions of the
cylinder heights. The light grey lines show the 2D empirical distributions. The black lines
show the estimates of the 3D distributions as calculated from (10). The left images are of
cylinders whose height and radii are uncorrelated and the underlying distribution of the
height is shown by the grey dot-dashed line. The right images are of cylinders whose height
and radii are correlated.
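
The contrast between the two height estimators can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' code. It assumes that (10) is the ratio estimator with weights $Z_i^{-1/2}$ suggested by the map $\phi(u,v) = v/u$ in the proof of Theorem 4. It uses a toy model with $X \sim$ Gamma(3) and $H$ given $X = x$ triangular on $[0, x]$, for which a direct calculation gives $H \sim$ Exp(1) in 3D. The 2D observations are generated geometrically: a cylinder is hit with probability proportional to $\sqrt{X}$, and the plane passes at a uniform distance from the axis. All function names are hypothetical.

```python
import math
import random


def sample_2d(n, rng):
    """Draw n observed (Z, H) pairs for the toy model described above."""
    pairs = []
    for _ in range(n):
        # Size-biased squared radius: density proportional to sqrt(x) * x^2 * exp(-x).
        x = rng.gammavariate(3.5, 1.0)
        # Triangular height on [0, x] with mode 0, by inverting its CDF.
        h = x * (1.0 - math.sqrt(1.0 - rng.random()))
        # Plane at distance sqrt(x) * u from the axis gives Z = x * (1 - u^2).
        u = rng.random()
        pairs.append((x * (1.0 - u * u), h))
    return pairs


def empirical_cdf(pairs, h):
    """Unweighted empirical distribution function of the observed heights."""
    return sum(1 for _, hi in pairs if hi <= h) / len(pairs)


def weighted_cdf(pairs, h):
    """Ratio estimator with weights Z^{-1/2} (assumed form of the plug-in estimator)."""
    num = sum(z ** -0.5 for z, hi in pairs if hi <= h)
    den = sum(z ** -0.5 for z, _ in pairs)
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    pairs = sample_2d(100_000, rng)
    h = 1.0
    print("true 3D value :", 1.0 - math.exp(-h))
    print("2D empirical  :", empirical_cdf(pairs, h))
    print("weighted      :", weighted_cdf(pairs, h))
```

Because radius and height are dependent in this toy model, the unweighted 2D empirical distribution is visibly biased, while the weighted ratio recovers the 3D distribution at the slower $\sqrt{\ln(n)/n}$ rate.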
5. Asymptotic distributions of the isotonic estimators. In this section we
study the consistency and asymptotic behavior of the isotonic estimators,
$\hat{N}_n$, as described in Section 3. To do so requires one further assumption.

Assumption 8. $\int_0^{\infty} N(t)\,dt < \infty$.

Theorem 5. Suppose $t \ge 0$ and $F_T$ from (5) has a density $f$ that is
strictly positive and continuous in a neighborhood of $t$ (a right neighborhood
if $t = 0$) and that $q(h;t)$ is defined as in (4). Further, suppose that
Assumptions 1, 4 and 8 hold. Then,

(29)

as $n \to \infty$.


The proof of this theorem can be found in the supplemental article
[McGarrity, Sietsma and Jongbloed (2014)]. The striking difference with
Theorem 2 is the factor 1/2 in the asymptotic variance. Enforcing
monotonicity therefore improves on the empirical estimator in two ways:
the resulting estimator satisfies the natural monotonicity constraint, and
it is asymptotically more accurate.
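
The isotonic estimator itself is defined in Section 3 and is not reproduced here. As a generic illustration of how a monotonicity constraint can be imposed on a noisy sequence, the sketch below implements the standard pool-adjacent-violators algorithm for a non-increasing least-squares fit. It is a swapped-in textbook technique, not the authors' procedure, and the function name `pava_nonincreasing` is hypothetical.

```python
def pava_nonincreasing(values, weights=None):
    """Weighted least-squares projection of a sequence onto non-increasing sequences."""
    n = len(values)
    w = [1.0] * n if weights is None else list(weights)
    # Each block stores [mean, total weight, number of points].
    blocks = []
    for v, wi in zip(values, w):
        blocks.append([float(v), float(wi), 1])
        # Pool adjacent blocks while the order constraint (previous >= current) fails.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


if __name__ == "__main__":
    # The violating pair (1, 2) is pooled to its mean.
    print(pava_nonincreasing([3, 1, 2, 0]))  # [3.0, 1.5, 1.5, 0.0]
```

Pooling averages out local violations of the order constraint, which is the mechanism by which a monotone fit can reduce variance relative to an unconstrained one.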

Analogous to Corollary 1, we have the following corollary. 


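Corollary 3. Suppose that $q(h;t) > 0$ for all $h$ and $t > 0$, and that
$F_T(t)$ has a density $f$ which is strictly positive at $t$ and continuous in a
neighborhood of $t$. Then, under the assumptions of Theorem 5,

(30)    $$\sqrt{\frac{n}{\ln n}}\,\bigl(\hat{F}_{T,n}(t) - F_T(t)\bigr) \rightsquigarrow \mathcal{N}\!\left(0,\ \frac{N(0)^2\, r_q(t) + N(t)^2\, g_Z(0)}{2N(0)^4}\right)$$

as $n \to \infty$.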


The proof of this corollary is analogous to the proof of Corollary 2 given
in Groeneboom and Jongbloed (1995), in our case applying Theorem 5 from
above. Recall that consistency at zero follows from Lemma 1 in the
supplemental article [McGarrity, Sietsma and Jongbloed (2014)].

6. Simulation. To validate the model and estimators, we implemented
a simulation where we work directly with the distributions of $X$ and $H$ to
calculate the distributions of $Z$ and $H$ for the rectangles, as well as the other
quantities of interest. To start, $X$ is taken to be Gamma(3) distributed and
$H$, given $X = x$, is triangularly distributed on $[0, x]$:

(31)    $$f_X(x) = \tfrac{1}{2}x^2 e^{-x},\quad x > 0, \qquad f_{H|X}(h|x) = \frac{2}{x^2}(x - h),\quad h \in (0, x).$$
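
As a sanity check on the model in (31), the covariance of $\sqrt{X}$ and $H$ can be computed in closed form and compared with a direct Monte Carlo sample of 3D cylinders. This is a minimal sketch with hypothetical function names. It samples the cylinders themselves, not the 2D rectangles, and it uses $E[H \mid X] = X/3$ for the triangular density and $E[X^s] = \Gamma(3+s)/\Gamma(3)$ for the Gamma(3) distribution.

```python
import math
import random


def exact_cov_sqrtx_h():
    """Cov(sqrt(X), H) for X ~ Gamma(3), H | X = x triangular on [0, x] with mode 0."""
    e_sqrt_x = math.gamma(3.5) / math.gamma(3.0)
    e_h = 3.0 / 3.0  # E[H] = E[X] / 3 with E[X] = 3
    e_sqrt_x_h = math.gamma(4.5) / math.gamma(3.0) / 3.0  # E[X^{3/2}] / 3
    return e_sqrt_x_h - e_sqrt_x * e_h


def sample_cylinders(n, rng):
    """Draw n (X, H) pairs from the 3D model by inverse-CDF sampling."""
    out = []
    for _ in range(n):
        x = rng.gammavariate(3.0, 1.0)
        # The CDF of H | X = x is 1 - (1 - h/x)^2, so H = x * (1 - sqrt(1 - U)).
        out.append((x, x * (1.0 - math.sqrt(1.0 - rng.random()))))
    return out


def sample_cov_sqrtx_h(pairs):
    """Sample covariance of sqrt(X) and H."""
    n = len(pairs)
    r = [math.sqrt(x) for x, _ in pairs]
    h = [hh for _, hh in pairs]
    mr, mh = sum(r) / n, sum(h) / n
    return sum((a - mr) * (b - mh) for a, b in zip(r, h)) / (n - 1)


if __name__ == "__main__":
    print(round(exact_cov_sqrtx_h(), 3))  # 0.277
    print(sample_cov_sqrtx_h(sample_cylinders(100_000, random.Random(1))))
```

The closed-form value rounds to 0.277, the true covariance reported for this simulation.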
Table 3

Results of the covariance estimation for the simulation. $n$ is the number of observed
rectangles on the cut plane. $\hat{\sigma}_{n,\sqrt{X}H}$ is the covariance estimate given in (13). The third
column gives the asymptotic variance determined from a single simulation run based on
(25) using the empirical means as estimates for the expectations in the equation. The
fourth column gives half the width of the constructed 95% confidence interval for the
covariance. The fifth column gives the empirical mean from 1000 simulation runs of the
covariance estimate

Covariance estimator and the asymptotic variance

n         Covariance    Asymptotic    95% CI        Mean of 1000
          estimate      variance      half-width    estimates
50        0.424         1.11          0.58          0.266
500       0.331         1.12          0.23          0.277
5000      0.262         1.58          0.10          0.277
50,000    0.273         1.53          0.04          0.276
∞         0.277         1.50          -             0.277


From the above, marginal and conditional densities of the observable
quantities can be calculated:

(32)
$$g_Z(z) = \frac{4}{15}\left(z^2 + z + \frac{3}{4}\right)e^{-z},\quad z > 0,$$

$$g_{H|Z}(h|z) = \frac{2(1/2 + z - h)}{z^2 + z + 3/4}\,1_{[0<h<z]} + \frac{2\left[(1/2 + z - h)\,I_G(1/2, h - z) + \sqrt{h - z}\,e^{-(h-z)}\right]}{\sqrt{\pi}\,(z^2 + z + 3/4)}\,1_{[h>z]},$$

where $I_G(m,x) = \int_x^{\infty} t^{m-1}e^{-t}\,dt$ is the incomplete Gamma function. From
the joint densities, the underlying distributions for the various quantities of
interest ($V$, $S$ and $R$) can be calculated. As an example, the distribution
function for the volume is as follows:

(33) F v (v) = 1 






where $\mathrm{Ei}(x) = \int_x^{\infty} e^{-u}u^{-1}\,du$ is the exponential integral. For this
simulation, we draw $n$ observations from the marginal density $g_Z$. For each
observation from $Z$, a corresponding height observation is drawn from the
conditional density $g_{H|Z}$ to form the 2D observations $(Z_1, H_1), \ldots, (Z_n, H_n)$
for the $n$ rectangles. From these observations it is possible to estimate the
various quantities of interest for the cylinder, beginning with the covariance
between the cylinder height and radius.
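
One possible way to carry out this two-stage draw is sketched below. It is an assumption-laden illustration, not the authors' implementation. It takes $g_Z(z) = \frac{4}{15}(z^2 + z + \frac{3}{4})e^{-z}$, which can be read as a mixture of Gamma(3), Gamma(2) and Gamma(1) densities with weights 8/15, 4/15 and 3/15, and it draws $H$ given $Z = z$ by rejection sampling from a uniform-plus-exponential envelope. The envelope, the helper names, and the use of $\sqrt{\pi}\,\mathrm{erfc}(\sqrt{a})$ for $I_G(1/2, a)$ are choices made for this sketch.

```python
import math
import random


def g_h_given_z(h, z):
    """Conditional density of the observed height H given Z = z."""
    if h < 0.0:
        return 0.0
    d = z * z + z + 0.75
    if h <= z:
        return 2.0 * (0.5 + z - h) / d
    a = h - z
    # Upper incomplete gamma I_G(1/2, a) equals sqrt(pi) * erfc(sqrt(a)).
    inc = math.sqrt(math.pi) * math.erfc(math.sqrt(a))
    return 2.0 * ((0.5 - a) * inc + math.sqrt(a) * math.exp(-a)) / (math.sqrt(math.pi) * d)


def draw_z(rng):
    """Draw Z from g_Z, written as a mixture of Gamma(3), Gamma(2), Gamma(1)."""
    shape = rng.choices([3.0, 2.0, 1.0], weights=[8.0, 4.0, 3.0])[0]
    return rng.gammavariate(shape, 1.0)


def draw_h_given_z(z, rng):
    """Rejection sampler for H | Z = z.

    Envelope: constant 2(1/2 + z)/d on [0, z) and exp(-(h - z))/d on [z, inf),
    where d = z^2 + z + 3/4. Both pieces dominate the conditional density.
    """
    d = z * z + z + 0.75
    mass_left, mass_right = 2.0 * z * (0.5 + z), 1.0  # envelope masses times d
    while True:
        if rng.random() * (mass_left + mass_right) < mass_left:
            h = rng.random() * z
            envelope = 2.0 * (0.5 + z) / d
        else:
            h = z + rng.expovariate(1.0)
            envelope = math.exp(-(h - z)) / d
        if rng.random() * envelope <= g_h_given_z(h, z):
            return h


def draw_observations(n, rng):
    """Draw n rectangle observations (Z_i, H_i)."""
    zs = [draw_z(rng) for _ in range(n)]
    return [(z, draw_h_given_z(z, rng)) for z in zs]


if __name__ == "__main__":
    obs = draw_observations(20_000, random.Random(2))
    print(sum(z for z, _ in obs) / len(obs))  # E[Z] = 7/3 under g_Z
    print(sum(h for _, h in obs) / len(obs))
```

Under these densities the sample mean of $Z$ should be near $7/3$ and the sample mean of the observed heights near $7/6$, which exceeds the 3D mean height of 1 because larger cylinders are more likely to be cut.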

Table 3 shows the results for the estimation of the covariance of $\sqrt{X}$ and
$H$ as calculated from the 2D observations. The first column indicates the
number of observed rectangles on the cut plane. For $n = \infty$, the true
underlying covariance and asymptotic variance are given. For this simulation, the
underlying covariance, as calculated from (9) and (31), is 0.277, and the true
underlying asymptotic variance, as calculated from (25) and (31), is 1.50.
The second column gives the estimates of the covariance for a single
simulation run. The third column gives the estimate of the asymptotic variance for
the covariance estimator for a single simulation run. The asymptotic
variance was estimated from the empirical means for the expectations in (25)
and by using the following estimator for (24):

(34)    $$\hat{g}_Z(0) = \frac{1}{n b_n}\sum_{i=1}^{n} 1_{[0,b_n)}(Z_i),$$

where $b_n \sim n^{-1/3}$ is a cutoff value for approximating $z = 0$ and can be shown
to have an optimal vanishing rate for the MSE of $n^{-2/3}$ (see Chapter 3
of McGarrity (2013) for details of the MSE and the supplemental article
[McGarrity, Sietsma and Jongbloed (2014)] for a discussion on the effects
of the choice of bandwidth). The fourth column gives the half-widths of the
constructed 95% confidence interval for the covariance using the estimators
for the covariance.
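
The half-widths in the fourth column of Table 3 can be reproduced from the third column under the $\sqrt{n/\ln n}$ normalization, that is, as $1.96\sqrt{\hat{\nu}\,\ln(n)/n}$ with $\hat{\nu}$ the variance estimate. The sketch below does this and also gives a possible implementation of a boundary estimator of the form $\frac{1}{n b_n}\sum_i 1_{[0,b_n)}(Z_i)$ with $b_n = n^{-1/3}$. The function names and the default constant in $b_n$ are assumptions made for illustration.

```python
import math


def boundary_density_estimate(z_values, b_n=None):
    """Estimate g_Z(0) as the fraction of observations in [0, b_n) divided by b_n."""
    n = len(z_values)
    if b_n is None:
        b_n = n ** (-1.0 / 3.0)  # assumed constant 1 in b_n ~ n^{-1/3}
    return sum(1 for z in z_values if 0.0 <= z < b_n) / (n * b_n)


def ci_half_width(variance, n, z_crit=1.96):
    """Half-width z_crit * sqrt(variance * ln(n) / n) of the asymptotic interval."""
    return z_crit * math.sqrt(variance * math.log(n) / n)


# Rows of Table 3: (n, asymptotic variance estimate, reported half-width).
TABLE_3 = [(50, 1.11, 0.58), (500, 1.12, 0.23), (5000, 1.58, 0.10), (50000, 1.53, 0.04)]

if __name__ == "__main__":
    for n, variance, reported in TABLE_3:
        print(n, round(ci_half_width(variance, n), 2), reported)
```

Each computed half-width matches the tabulated value after rounding to two decimals.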

The final column shows the empirical mean over 1000 simulation runs 
of the covariance estimate using the 2D observations. The results behave 
as expected. While the single simulation runs at small n give large values 
for the covariance estimate, the true covariance falls within the constructed 
95% confidence interval. As n increases, the estimated covariance approaches 
the true covariance. For the mean of 1000 simulation runs, we see that the
estimated value of the covariance is much closer to the expected value, even
for small n. This is in line with the consistency of the estimator and
suggests that its bias is small.

It is also possible to estimate the underlying distribution functions, such
as that given in (33), and to compare them with the empirical distribution
functions of the quantities of interest based on the 2D observations
$(Z_1, H_1), \ldots, (Z_n, H_n)$ treated as if they were distributed as the $(X, H)$.
This means, for example, that for (33), $V_i = \pi Z_i H_i$. This is done because
often only 2D data is accessible, and using it has often been justified as a
reasonable approximation for the 3D data. In the case of the squared radius,
including the 2D data in this way emphasizes the bias inherent in the
observations. For the volume, surface area and aspect ratio, the bias still
exists. The question is whether the proposed estimators provide a better
estimate than using the 2D data directly and, judging by the confidence
intervals, it would seem that they do.

Applying both the empirical and isotonic estimation procedures to the
generated data sets leads to the results for the estimation of the aspect
ratio (left), the surface area (middle) and the volume (right) displayed in
Figure 5. The underlying distribution is given by the dashed dark grey curve.
The empirical distribution based on the 2D observations [as if the $(Z, H)$
were distributed as the $(X, H)$] is given by the light grey curve. The isotonic
estimate of the distribution of the quantity of interest based on the 2D
observations is given by the black curve. The 95% point-wise confidence sets
for the isotonic estimator are given by the grey diamonds.

[Figure 5 panels: estimate of aspect ratio distribution, $R = X^{1/2}/H$; estimate of surface area distribution, $S = 2\pi(X + X^{1/2}H)$; estimate of volume distribution, $V = \pi X H$. Legend: underlying, empirical (2D), isotonic estimate.]

Fig. 5. Plots of the cumulative distribution functions for the aspect ratio, $R$, surface area,
$S$, and volume, $V$, for n = 500 observations of $(Z, H)$ drawn from the 2D distributions in
(32). In all figures, the dashed dark grey line gives the underlying distribution, the light
grey line gives the empirical distribution based on the 2D observations $(Z, H)$, and the
black line gives the isotonic estimation of the distribution of the quantity of interest based
on the 2D observations. The grey diamonds give approximate 95% point-wise confidence
sets for the isotonic estimator.

The point-wise confidence sets are calculated from the results of Corollary 3.
To obtain an estimate of the asymptotic variance given in (30),
only the function $r_q(t) = \int_{h=0}^{\infty} g(q(h;t), h)\,dh$ needs yet to be estimated. The
estimates for $N(0)$ and $N(t)$ can be obtained from the isotonic estimates
described in Section 3. The value $g_Z(0) = \int_0^{\infty} g(0,y)\,dy$ can be estimated by
(34). Following the same idea as this estimator, and without going into the
asymptotic behavior, we can estimate $r_q(t)$ consistently as

$$\hat{r}_q(t) = \frac{1}{n b_n}\sum_{i=1}^{n} 1_{[0,b_n)}\bigl(Z_i - q(H_i;t)\bigr).$$

The underlying distribution is mostly within the 95% point-wise confidence
sets, indicating that the estimator is reasonable and can be used in practice.

Fig. 6. Figure (a) shows a view of the 3D reconstruction of the banded microstructure
from the serial sectioned images. Figure (b) shows the bounding boxes around the features
of interest (henceforth referred to as rectangles and cylinders for the 2D and 3D objects,
respectively) in the microstructure.

7. Application of the model to real microstructures. The model and
estimation procedures are now applied to the banded steel microstructure
shown in Figure 1. To obtain 3D information about the microstructure, the
material was serial sectioned, providing images approximately every 2 μm
to a depth of about 90 μm. For details on the experimental procedure see
McGarrity, Sietsma and Jongbloed (2012b). The optical images were
processed with dilation and closing image operations on binary thresholds. The
serial sectioned images were combined to form a single 3D object, shown in
Figure 6(a), and the bounding boxes, that is, the smallest boxes that contain
all voxels of the object being considered, around the 3D features of interest
(henceforth referred to as cylinders) were found using the 3D analysis
function in Fiji [Bolte and Cordelieres (2006)]. From Figure 6(a) it is clear
that the 3D data is incomplete. The sectioning depth was not sufficient to
observe a cylinder in its entirety. This gives a clear indication of why using
this model to estimate the distributions of the quantities of interest is
so important. Using even one of the section images, like the one shown in
Figure 6(b), can provide a reasonable estimate for the underlying 3D
distribution that is costly to obtain directly. Figure 6(b) shows rectangles around
the 2D features of interest. These are the smallest rectangles to fully contain
the objects of interest and are called bounding boxes. These rectangles were
found using Fiji software [Rasband (1997-2009)] and yield the observed data
pairs $(Z, H)$ used in the estimation procedures. For a discussion on using
bounding boxes to represent the rectangles and how it affects the results of
the model, see Chapter 6 of McGarrity (2013).

Table 4

Results for the moment and covariance estimates of the microstructure data with 179
rectangle observations. The first column gives the estimated quantity. The second gives
the estimate of that quantity with the half-widths of the constructed 95% confidence
intervals using the estimates for the asymptotic variance

Moment and covariance estimates (n = 179)

Quantity                  2D estimate ± half-width of 95% CI
$E[\sqrt{X}]$             10.55 ± 2.30 μm
$E[X]$                    125 ± 27 μm²
$E[H]$                    8.72 ± 0.56 μm
$E[S]$                    1434 ± 332 μm²
$E[V]$                    4370 ± 950 μm³
$\sigma_{\sqrt{X}H}$      11.1 ± 25.6 μm²

Table 4 gives the 2D estimates for the moments and covariance, and the 
half-widths of their constructed 95% confidence intervals. The second column 
of the table gives the estimates using (12) and (13) for the moments and 
covariance based on a single 2D estimate for the asymptotic variances of the 
moments from Table 2. For the covariance, the estimates for the asymptotic 
variance come from (25). 

Using the 2D data set from any single slice of the serial sectioning, we
can apply the model and estimation procedures to find the CDFs of the
various quantities of interest. Figure 7 shows the results of the estimation
procedures. The upper left plot shows the results for the isotonic estimation
of the squared radius distribution. The upper right plot shows the plug-in
estimation results for the height distribution. The middle plot and the
lower left and right plots show the results for the isotonic estimation of the
distributions for the volume, aspect ratio and surface area, respectively. In all
plots, the light grey lines show the empirical estimates obtained by treating
the rectangle squared half-width and height as if they were the squared
radius and height of the cylinder. The black lines are the isotonic estimation
results for the underlying distribution functions of the quantities of interest
given the 2D observations. The grey diamonds give the asymptotic point-wise
confidence sets for the isotonic estimates of $F_T$ taken at the values of
$t$ corresponding to the 2D observations. These bands are calculated from
the results of Corollary 3 and Theorem 4. For the asymptotic variance of
the height distribution, the estimator $F_{H,n}$ is used. An estimator for the
integrals again follows the same idea as the estimator for $g_Z(0)$ and, without
considering the asymptotic behavior, we obtain

$$\frac{1}{n b_n}\sum_{i=1}^{n} 1_{[0,b_n)}(Z_i)\,1_{[h,\infty)}(H_i) \to \int_{y=h}^{\infty} g(0,y)\,dy,$$

$$\frac{1}{n b_n}\sum_{i=1}^{n} 1_{[0,b_n)}(Z_i)\,1_{[0,h)}(H_i) \to \int_{y=0}^{h} g(0,y)\,dy.$$

[Figure 7 panels: estimate of squared radius distribution, $X$; estimate of height distribution, $H$; estimate of volume distribution, $V = \pi X H$; estimate of aspect ratio distribution, $R = X^{1/2}/H$; estimate of surface area distribution, $S = 2\pi(X + X^{1/2}H)$.]

Fig. 7. Results of the model and estimation procedures applied to the microstructure
shown in Figure 6. The number of observations is n = 179. The light grey lines are the
estimates obtained by treating the squared half-width and height of the bounding box as
if it were the squared radius and height of the cylinder. The black lines give the isotonic
estimations of the underlying distribution functions of the quantities of interest given the
2D observations. The grey diamonds give the asymptotic 95% point-wise confidence sets
for the isotonic estimates.

As can be seen from the plots in Figure 7, using the empirical 2D
distributions tends to overestimate the small values and underestimate the large
values of the quantities of interest. The 2D empirical distributions do not
provide a reasonable picture for the distribution of the 3D quantities of
interest. The exception to that, of course, is the height distribution. Due to the
potential correlation between the radius and height of the cylinders, suggested
by the nonzero covariance estimate in Table 4, there appears to be a
small bias in the 2D observations, leading to a slight underrepresentation
of the larger height values, yet it is still encompassed within the point-wise
confidence sets. The results of the estimates for the covariance of the cylinder
radius and height, the estimates for the various moments and the isotonic
estimates for the CDFs of the 3D quantities of interest provide a glimpse into
the microstructure that cannot be reliably obtained from the serial sectioned
data.

8. Discussion. Often, it is difficult to know about the full 3D nature
of the material or object being studied. The methods available to obtain
3D data about a material tend to be expensive in terms of resources and
time, destructive and limited to small length scales. For instance, the total
attained depth from several weeks of serial sectioning was about 90 μm
for the microstructures shown in this work, while many of the cylinders
are seen to be significantly larger than that. The serial sectioning is not
enough to view a cylinder in its entirety through the depth of the sectioning.
Therefore, in industry in particular, most information about a material is
based upon 2D observations, which in many cases are insufficient. Stereology
was developed to address this issue and to find ways to extract information
about the 3D nature from the 2D observations. However, in order to be
able to do this, certain assumptions must be made about the objects being
studied. In the case of the Oriented Cylinder Model introduced in this work,
the assumptions are that the objects in the material can be represented
by circular cylinders whose axes of symmetry are all oriented in the same
direction and that the cut through the material is along that axis. It is
also assumed that the cylinders are uniformly distributed throughout the
material. While this model is simple and the assumptions are somewhat
ideal, our observations suggest that this is a reasonable starting point upon
which more complex models can be built. The Oriented Cylinder Model
provides insight into the material that has, until now, been lacking.

Assuming the model assumptions are reasonable, estimators are used to
obtain estimates of the unknown underlying distributions of various
quantities of interest. Since too little is known about the material studied in this
work to assume a specific distribution, nonparametric estimators were chosen
rather than parametric ones. While parametric estimators will have a better
rate of convergence and smaller variance, the difference is of order
$\sqrt{\ln(n)}$. The flexibility afforded by the nonparametric model makes up
for this difference.

The results presented, especially for the microstructure data in Section 7,
are informative, given how little information is available for the 3D nature of
the material. However, there are several considerations, particularly inherent
to processing the images, that have not been considered in this particular
work. Edge effects are not accounted for in this analysis. The cylinders are
considered to lie completely within the bounds of the observation window. However,
it is possible that cylinders ending at the edge of the image continue
beyond it. While edge effects can
be eliminated from the simulation results presented in Section 6, they
cannot reasonably be ignored for the microstructure. Features of interest like
microstructural bands often deviate from perfect cylinders and are not
observable as perfect rectangles. This leads to challenges in defining the
dimensions of the observed rectangle. In this work, the bounding box around the
feature of interest was taken as the rectangle. However, using the bounding
box leads to overestimation of the heights and squared radii, though the
significance of this overestimation is not immediately known. Determining an
object of interest in an image is often done through pixel connectivity. Even
though the images have undergone morphological processing, as described
in McGarrity, Sietsma and Jongbloed (2012a), it is not always possible to
preserve the true connectivity of the objects. How this affects the outcome of
the estimation under the model assumptions is also not immediately clear.
These issues are important to consider, but are beyond the scope of this
particular work.

Despite these issues, and the simplicity of the model, the estimated
distributions for the 3D quantities of interest are practicable representations
of the underlying distributions. As a first step toward understanding and
modeling a full 3D microstructure, this work provides a solid starting point
and a reasonable approximation to what is often not directly observable.
APPENDIX: RELATIONSHIPS FOR THE QUANTITIES OF INTEREST

First, define the quantity of interest, squared radius, aspect ratio, surface
area or volume, as $t$. Let $(u, h)$ be the observed pair of variables. For a fixed
$h > 0$ we can define $t = p(h;u)$ for each quantity of interest. In (4) the inverse
of $p(h;u)$ is defined as $u = q(h;t)$. These can each be calculated as follows:

(35)
$$p(h;u) = \begin{cases} u & \text{(squared radius)} \\ \sqrt{u}/h & \text{(aspect ratio)} \\ 2\pi(u + h\sqrt{u}) & \text{(surface area)} \\ \pi h u & \text{(volume)} \end{cases} \qquad \text{and} \qquad \begin{cases} t \\ h^2 t^2 \\ \left(\sqrt{\frac{h^2}{4} + \frac{t}{2\pi}} - \frac{h}{2}\right)^{2} \\ \frac{t}{\pi h} \end{cases} = q(h;t).$$

It is important to note for all choices of $p(h;u)$ and $q(h;t)$ that $p(h;q(h;t)) = t$
and $q(h;p(h;u)) = u$.

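The derivative of these functions with respect to the second argument is
also important. Denoting the partial derivative of $p$ with respect to $u$ by $\dot{p}$
and the partial derivative of $q$ with respect to $t$ by $\dot{q}$ results in

$$\dot{p}(h;u) = \begin{cases} 1 \\ \frac{1}{2h\sqrt{u}} \\ 2\pi\left(1 + \frac{h}{2\sqrt{u}}\right) \\ \pi h \end{cases} \qquad \text{and} \qquad \begin{cases} 1 \\ 2h^2 t \\ \frac{1}{2\pi}\left(1 - \frac{h}{\sqrt{h^2 + 2t/\pi}}\right) \\ \frac{1}{\pi h} \end{cases} = \dot{q}(h;t).$$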

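Considering the relationship between $p$ and $q$ and using the linear
approximation of $q$ near $t$ yields

$$\dot{p}(h;q(h;t)) = \lim_{\varepsilon \downarrow 0} \frac{p\bigl(h; q(h; t + \varepsilon/\dot{q}(h;t))\bigr) - p\bigl(h; q(h;t)\bigr)}{\varepsilon} = \lim_{\varepsilon \downarrow 0} \frac{t + \varepsilon/\dot{q}(h;t) - t}{\varepsilon} = \frac{1}{\dot{q}(h;t)}.$$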
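
The inverse relations and the derivative identity can be verified numerically. The following sketch encodes the four maps $p$ and their inverses $q$ for squared radius, aspect ratio $\sqrt{u}/h$, surface area $2\pi(u + h\sqrt{u})$ and volume $\pi h u$, and approximates the partial derivatives by central differences. The string labels, function names and step size are illustrative choices, not part of the original.

```python
import math

KINDS = ("squared_radius", "aspect_ratio", "surface_area", "volume")


def p(kind, h, u):
    """Quantity of interest t as a function of the squared radius u, for fixed h."""
    if kind == "squared_radius":
        return u
    if kind == "aspect_ratio":
        return math.sqrt(u) / h
    if kind == "surface_area":
        return 2.0 * math.pi * (u + h * math.sqrt(u))
    if kind == "volume":
        return math.pi * h * u
    raise ValueError(kind)


def q(kind, h, t):
    """Inverse of p in its second argument: the u that gives t, for fixed h."""
    if kind == "squared_radius":
        return t
    if kind == "aspect_ratio":
        return (h * t) ** 2
    if kind == "surface_area":
        return (math.sqrt(h * h / 4.0 + t / (2.0 * math.pi)) - h / 2.0) ** 2
    if kind == "volume":
        return t / (math.pi * h)
    raise ValueError(kind)


def central_difference(f, x, eps=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    h, t = 2.0, 5.0
    for kind in KINDS:
        u = q(kind, h, t)
        p_dot = central_difference(lambda s: p(kind, h, s), u)
        q_dot = central_difference(lambda s: q(kind, h, s), t)
        # Round trip p(h; q(h; t)) should return t, and p_dot * q_dot should be near 1.
        print(kind, p(kind, h, u), p_dot * q_dot)
```

For each quantity the round trip returns $t$ and the product of the two numerical derivatives is close to one, in agreement with $\dot{p}(h;q(h;t)) = 1/\dot{q}(h;t)$.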

Finally, note that $y > q(h;t)$ if and only if $t < p(h;y)$. Recall the
expression for $W_n$ can be written in terms of the function $\phi_{u,v}$. We can use the
substitution $u = q(h;y)$ in the definition of $\phi_{u,v}$ and obtain, for $z$ and $h$
fixed,

(36)
SUPPLEMENTARY MATERIAL

Supplement to “Nonparametric inference in a stereological model with
oriented cylinders applied to dual phase steel”
(DOI: 10.1214/14-AOAS787SUPP; .pdf). Proofs for equation (19), Lemma 1,
relation (28) and Theorem 5, discussion of coverage probabilities for
equation (27), and discussion of equation (34).